AITopics | data optimization

Collaborative Unlabeled Data Optimization

Shang, Xinyi, Sun, Peng, Liu, Fengyuan, Lin, Tao

arXiv.org Artificial Intelligence, Oct-13-2025

This paper pioneers a novel data-centric paradigm to maximize the utility of unlabeled data, tackling a critical question: How can we enhance the efficiency and sustainability of deep learning training by optimizing the data itself? We begin by identifying three key limitations in existing model-centric approaches, all rooted in a shared bottleneck: knowledge extracted from data is locked to model parameters, hindering its reusability and scalability. To this end, we propose CoOpt, a highly efficient, parallelized framework for collaborative unlabeled data optimization, thereby effectively encoding knowledge into the data itself. By distributing unlabeled data and leveraging publicly available task-agnostic models, CoOpt facilitates scalable, reusable, and sustainable training pipelines. Extensive experiments across diverse datasets and architectures demonstrate its efficacy and efficiency, achieving 13.6% and 6.8% improvements on Tiny-ImageNet and ImageNet-1K, respectively, with training speedups of $1.94\times$ and $1.2\times$.

artificial intelligence, machine learning, participant, (18 more...)

2505.14117

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

ADO: Automatic Data Optimization for Inputs in LLM Prompts

Lin, Sam, Hua, Wenyue, Li, Lingyao, Wang, Zhenting, Zhang, Yongfeng

arXiv.org Artificial Intelligence, Feb-16-2025

This study explores a novel approach to enhance the performance of Large Language Models (LLMs) through the optimization of input data within prompts. While previous research has primarily focused on refining instruction components and augmenting input data with in-context examples, our work investigates the potential benefits of optimizing the input data itself. We introduce a two-pronged strategy for input data optimization: content engineering and structural reformulation. Content engineering involves imputing missing values, removing irrelevant attributes, and enriching profiles by generating additional information inferred from existing attributes. Subsequent to content engineering, structural reformulation is applied to optimize the presentation of the modified content to LLMs, given their sensitivity to input format. Our findings suggest that these optimizations can significantly improve the performance of LLMs in various tasks, offering a promising avenue for future research in prompt engineering. The source code is available at https://anonymous.4open.science/r/ADO-6BC5/
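The two stages described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `record_id` attribute, the median and `"unknown"` imputation rules, and the `key: value` layout are all assumptions chosen for illustration. The LLM-driven enrichment step mentioned in the abstract is left out.

```python
from statistics import median


def engineer_content(records, irrelevant=("record_id",)):
    """Content engineering (hypothetical rules): drop irrelevant attributes, impute missing values."""
    # Numeric attributes are those with at least one int/float value present.
    numeric_keys = {
        key
        for rec in records
        for key, value in rec.items()
        if isinstance(value, (int, float)) and key not in irrelevant
    }
    # Median of each numeric attribute over the values that are present.
    medians = {
        key: median(rec[key] for rec in records if rec.get(key) is not None)
        for key in numeric_keys
    }
    cleaned = []
    for rec in records:
        out = {}
        for key, value in rec.items():
            if key in irrelevant:
                continue  # remove irrelevant attribute
            if value is None:
                # Numeric gaps get the median; other gaps get a placeholder.
                value = medians.get(key, "unknown")
            out[key] = value
        cleaned.append(out)
    return cleaned


def reformulate(record):
    """Structural reformulation (one assumed layout): 'key: value' lines in sorted key order."""
    return "\n".join(f"{key}: {record[key]}" for key in sorted(record))


# Toy data, invented for this example.
patients = [
    {"record_id": 1, "age": 40, "smoker": "yes"},
    {"record_id": 2, "age": None, "smoker": "no"},
    {"record_id": 3, "age": 60, "smoker": None},
]
cleaned = engineer_content(patients)
prompt_input = reformulate(cleaned[1])
```

Here `prompt_input` would be placed in the data slot of a prompt. Which imputation rule and which layout work best is exactly what the paper proposes to optimize, so the fixed choices above are only placeholders.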

large language model, machine learning, natural language, (17 more...)

2502.11436

Country: North America (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Data Pipeline Training: Integrating AutoML to Optimize the Data Flow of Machine Learning Models

Wu, Jiang, Wang, Hongbo, Ni, Chunhe, Zhang, Chenwei, Lu, Wenran

arXiv.org Artificial Intelligence, Feb-20-2024

Data pipelines play an indispensable role in tasks such as building machine learning models and developing data products. With the increasing diversity and complexity of data sources, as well as the rapid growth of data volumes, building an efficient data pipeline has become crucial for improving work efficiency and solving complex problems. This paper explores how to optimize data flow through automated machine learning by integrating AutoML with the data pipeline. We discuss how to leverage AutoML to make data pipelines more intelligent, thereby achieving better results in machine learning tasks. By examining the automation and optimization of data flows, we identify key strategies for constructing efficient data pipelines that can adapt to the ever-changing data landscape. This not only accelerates the modeling process but also provides innovative solutions to complex problems, enabling more significant outcomes in increasingly intricate data domains. Keywords: Data Pipeline Training; AutoML; Data environment; Machine learning

data pipeline, optimization, pipeline, (13 more...)

2402.12916

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > United States > Texas > Dallas County > Richardson (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Data Optimization in Deep Learning: A Survey

Wu, Ou, Yao, Rujing

arXiv.org Artificial Intelligence, Oct-25-2023

Large-scale, high-quality data are considered an essential factor for the successful application of many deep learning techniques. Meanwhile, numerous real-world deep learning tasks still have to contend with a lack of sufficient high-quality data. Additionally, issues such as model robustness, fairness, and trustworthiness are closely related to training data. Consequently, a huge number of studies in the existing literature have focused on the data aspect of deep learning tasks. Typical data optimization techniques include data augmentation, logit perturbation, sample weighting, and data condensation. These techniques usually come from different deep learning divisions, and their theoretical inspirations or heuristic motivations may seem unrelated to each other. This study aims to organize a wide range of existing data optimization methodologies for deep learning from the previous literature, and constructs a comprehensive taxonomy for them. The constructed taxonomy considers the diversity of split dimensions, and deep sub-taxonomies are constructed for each dimension. On the basis of the taxonomy, connections among the extensive data optimization methods for deep learning are built in terms of four aspects. We also outline several promising and interesting future directions. The constructed taxonomy and the revealed connections should support a better understanding of existing methods and the design of novel data optimization techniques. Furthermore, our aspiration for this survey is to promote data optimization as an independent subdivision of deep learning. A curated, up-to-date list of resources related to data optimization in deep learning is available at https://github.com/YaoRujing/Data-Optimization.
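Of the techniques named in this abstract, sample weighting is the simplest to show in a few lines. The sketch below uses inverse class-frequency weights, a common heuristic for imbalanced labels. It is an assumed example and is not taken from the survey; the function names and the toy labels and losses are invented for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each sample by n / (k * count of its class), so rare classes count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


def weighted_mean_loss(losses, weights):
    """Weighted average of per-sample losses, normalized by the total weight."""
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Toy, imbalanced label set and made-up per-sample losses.
labels = ["cat", "cat", "cat", "dog"]
weights = inverse_frequency_weights(labels)
loss = weighted_mean_loss([1.0, 1.0, 1.0, 3.0], weights)
```

With these weights the single `dog` sample contributes as much to the loss as the three `cat` samples combined, which is the effect this kind of weighting is meant to achieve.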

data optimization, deep learning

2310.16499

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

ERP Systems: How It Benefits From Artificial Intelligence - AI Summary

#artificialintelligence, Nov-8-2021, 01:50:08 GMT

AI can help improve this aspect of the business in more than one way; for example, it can assist companies with data optimization, i.e., ensuring that all their data is not only up to date but also optimized and complete. ERP solutions fortified with AI can also help companies close gaps between departments within the organization, empower executives to make sound, data-driven decisions, and much more. AI-driven ERP solutions also offer features such as chatbots that can quickly learn from the company's data and then use it to improve customers' journeys and experiences with the brand. In addition, a trusted provider of enterprise software development services brings the expertise needed to fortify your ERP solution with AI successfully. Researchers have also found that as many as 83 percent of companies believe AI is critical to their business growth.

artificial intelligence, endeavor, erp solution, (13 more...)

Technology:

Information Technology > Enterprise Applications > Enterprise Resource Planning (1.00)
Information Technology > Artificial Intelligence (1.00)

How to Bring Your ML Models to Production Faster

#artificialintelligence, Oct-14-2021, 13:10:39 GMT

After working on several AI projects, I realized how quickly building efficient Machine Learning models is becoming a core competency for companies that want to compete more effectively. Decision-makers are learning that managing the whole lifecycle of building, deploying, and debugging models within their existing tech stack is not straightforward and brings a new set of challenges. In my experience, data scientists often spend time analyzing a dataset, looking for suitable algorithms, and training a new model, then hand it over to data engineers to run in production. This separation can lead to problems where data scientists don't see the challenges of running the model in production, and data engineers don't know how the models are structured. I have often seen data scientists write applications that don't scale in production.

data scientist, ml project, pipeline, (12 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

AI Series: Part 1 - AI vs. Machine Learning vs. Deep Learning

#artificialintelligence, Jan-1-2018, 00:46:18 GMT

How smart are you when it comes to the nuances of Artificial Intelligence (AI)? When you read about the future of AI, it can seem like there are a lot of buzzwords being thrown around in the media. Differentiating among AI, machine learning and deep learning technologies can be confusing, especially when terms are being used interchangeably. Let's begin by clearing things up with a few definitions. First, there is AI, which refers to intelligence exhibited by machines in the form of human cognitive functions like visual perception, speech recognition, decision-making, and language translation.

artificial intelligence, data optimization, machine learning, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
